
Creators/Authors contains: "Zhang, Zihan"


  1. From higher computational efficiency to enabling the discovery of novel and complex structures, deep learning has emerged as a powerful framework for the design and optimization of nanophotonic circuits and components. However, both data-driven and exploration-based machine learning strategies have limitations in their effectiveness for nanophotonic inverse design. Supervised machine learning approaches require large quantities of training data to produce high-performance models and have difficulty generalizing beyond the training data given the complexity of the design space. Unsupervised and reinforcement learning-based approaches, on the other hand, can have very lengthy training or optimization times. Here we demonstrate a hybrid supervised learning and reinforcement learning approach to the inverse design of nanophotonic structures and show that this approach can reduce training data dependence, improve the generalizability of model predictions, and significantly shorten exploratory training times. The presented strategy thus addresses several contemporary deep learning-based challenges, while opening the door for new design methodologies that leverage multiple classes of machine learning algorithms to produce more effective and practical solutions for photonic design.

     
    more » « less
  2. We study model-free reinforcement learning (RL) algorithms for infinite-horizon average-reward Markov decision processes (MDPs), a setting that is more appropriate for applications that involve continuing operations not divided into episodes. In contrast to episodic/discounted MDPs, theoretical understanding of model-free RL algorithms is relatively inadequate for the average-reward setting. In this paper, we consider both the online setting and the setting with access to a simulator. We develop computationally efficient model-free algorithms that achieve sharper guarantees on regret/sample complexity compared with existing results. In the online setting, we design an algorithm, UCB-AVG, based on an optimistic variant of variance-reduced Q-learning. We show that UCB-AVG achieves a regret bound $\widetilde{O}(S^5A^2sp(h^*)\sqrt{T})$ after $T$ steps, where $S\times A$ is the size of the state-action space and $sp(h^*)$ is the span of the optimal bias function. Our result provides the first computationally efficient model-free algorithm that achieves the optimal dependence on $T$ (up to log factors) for weakly communicating MDPs, an assumption that is necessary for low regret. In contrast, prior results either are suboptimal in $T$ or require strong assumptions of ergodicity or uniform mixing of MDPs. In the simulator setting, we adapt the idea of UCB-AVG to develop a model-free algorithm that finds an $\epsilon$-optimal policy with sample complexity $\widetilde{O}(SAsp^2(h^*)\epsilon^{-2} + S^2Asp(h^*)\epsilon^{-1})$. This sample complexity is near-optimal for weakly communicating MDPs, in view of the minimax lower bound $\Omega(SAsp(h^*)\epsilon^{-2})$. Existing work mainly focuses on ergodic MDPs, and the results typically depend on $t_{mix}$, the worst-case mixing time induced by a policy. We remark that the diameter $D$ and mixing time $t_{mix}$ are both lower bounded by $sp(h^*)$, and $t_{mix}$ can be arbitrarily large for certain MDPs.
     On the technical side, our approach integrates two key ideas: learning a $\gamma$-discounted MDP as an approximation, and leveraging reference-advantage decomposition for variance reduction in optimistic Q-learning. As recognized in prior work, a naive approximation by discounted MDPs results in suboptimal guarantees. A distinguishing feature of our method is maintaining estimates of the value difference between state pairs to provide a sharper bound on the variance of the reference advantage. We also crucially use a careful choice of the discount factor $\gamma$ to balance the approximation error due to discounting against the statistical learning error, and we are able to maintain a good-quality reference value function with $O(SA)$ space complexity.
    Free, publicly-accessible full text available July 1, 2024
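The quantities in the bounds quoted in item 2 can be made concrete with a small numerical sketch. This is not the authors' algorithm; it only evaluates the span $sp(h) = \max_s h(s) - \min_s h(s)$ of a bias vector and the two terms of the stated simulator-setting bound, with constants and logarithmic factors dropped. The function names and the example numbers are hypothetical, chosen for illustration only.

```python
import math


def span(h):
    """Span seminorm of a bias vector: sp(h) = max_s h(s) - min_s h(s)."""
    return max(h) - min(h)


def simulator_bound_terms(S, A, sp_h, eps):
    """Return the two terms of the stated sample-complexity bound.

    Constants and log factors are dropped (an assumption of this sketch),
    so the values indicate scaling only, not actual sample counts.
    """
    leading = S * A * sp_h ** 2 / eps ** 2      # S A sp^2(h*) eps^-2
    lower_order = S ** 2 * A * sp_h / eps       # S^2 A sp(h*) eps^-1
    return leading, lower_order


def online_regret_scale(S, A, sp_h, T):
    """Scaling of the stated regret bound: S^5 A^2 sp(h*) sqrt(T)."""
    return S ** 5 * A ** 2 * sp_h * math.sqrt(T)


# Hypothetical example: a 3-state bias vector and a small MDP size.
h_example = [0.0, 1.5, -0.5]
sp_example = span(h_example)
leading, lower_order = simulator_bound_terms(S=10, A=4, sp_h=sp_example, eps=0.1)
print(sp_example, leading, lower_order)
```

Comparing the two terms algebraically, the $\epsilon^{-2}$ term is at least as large as the $\epsilon^{-1}$ term exactly when $\epsilon \le sp(h^*)/S$, which is the regime where the leading term governs the bound.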
  3. Electroluminescence efficiencies and stabilities of quasi-two-dimensional halide perovskites are restricted by the formation of multiple-quantum-well structures with broad and uncontrollable phase distributions. Here, we report a ligand design strategy to substantially suppress diffusion-limited phase disproportionation, thereby enabling better phase control. We demonstrate that extending the π-conjugation length and increasing the cross-sectional area of the ligand enables perovskite thin films with dramatically suppressed ion transport, narrowed phase distributions, reduced defect densities, and enhanced radiative recombination efficiencies. Consequently, we achieved efficient and stable deep-red light-emitting diodes with a peak external quantum efficiency of 26.3% (average 22.9% among 70 devices and cross-checked) and half-lives of ~220 h and 2.8 h under constant current densities of 0.1 and 12 mA/cm², respectively. Our devices also exhibit wide wavelength tunability and improved spectral and phase stability compared with existing perovskite light-emitting diodes. These discoveries provide critical insights into the molecular design and crystallization kinetics of low-dimensional perovskite semiconductors for light-emitting devices.
    Free, publicly-accessible full text available December 1, 2024
  4. Phase separation plays crucial roles in both sustaining cellular function and perpetuating disease states. Despite extensive studies, our understanding of this process is hindered by low solubility of phase-separating proteins. One example of this is found in SR and SR-related proteins. These proteins are characterized by domains rich in arginine and serine (RS domains), which are essential to alternative splicing and in vivo phase separation. However, they are also responsible for a low solubility that has made these proteins difficult to study for decades. Here, we solubilize the founding member of the SR family, SRSF1, by introducing a peptide mimicking RS repeats as a co-solute. We find that this RS-mimic peptide forms interactions similar to those of the protein’s RS domain. Both interact with a combination of surface-exposed aromatic residues and acidic residues on SRSF1’s RNA Recognition Motifs (RRMs) through electrostatic and cation-pi interactions. Analysis of RRM domains from human SR proteins indicates that these sites are conserved across the protein family. In addition to opening an avenue to previously unavailable proteins, our work provides insight into how SR proteins phase separate and participate in nuclear speckles. 
  5. Immune cells degrade internalized pathogens in phagosomes through sequential biochemical changes. The degradation must be fast enough for effective infection control. The presumption is that each phagosome degrades cargos autonomously with a distinct but stochastic kinetic rate. However, here we show that the degradation kinetics of individual phagosomes is not stochastic but coupled to their intracellular motility. By engineering RotSensors that are optically anisotropic, magnetically responsive, and fluorogenic in response to degradation activities in phagosomes, we monitored cargo degradation kinetics in single phagosomes simultaneously with their translational and rotational dynamics. We show that phagosomes that move faster centripetally are more likely to encounter and fuse with lysosomes, thereby acidifying faster and degrading cargos more efficiently. The degradation rates increase nearly linearly with the translational and rotational velocities of phagosomes. Our results indicate that the centripetal motion of phagosomes functions as a clock for controlling the progression of cargo degradation.

     
    more » « less